A novel hybrid ultrafast shape descriptor method for use in virtual screening

نویسندگان

  • Edward O Cannon
  • Florian Nigsch
  • John BO Mitchell
چکیده

BACKGROUND We have introduced a new Hybrid descriptor composed of the MACCS key descriptor encoding topological information and Ballester and Richards' Ultrafast Shape Recognition (USR) descriptor. The latter one is calculated from the moments of the distribution of the interatomic distances, and in this work we also included higher moments than in the original implementation. RESULTS The performance of this Hybrid descriptor is assessed using Random Forest and a dataset of 116,476 molecules. Our dataset includes 5,245 molecules in ten classes from the 2005 World Anti-Doping Agency (WADA) dataset and 111,231 molecules from the National Cancer Institute (NCI) database. In a 10-fold Monte Carlo cross-validation this dataset was partitioned into three distinct parts for training, optimisation of an internal threshold that we introduced, and validation of the resulting model. The standard errors obtained were used to assess statistical significance of observed improvements in performance of our new descriptor. CONCLUSION The Hybrid descriptor was compared to the MACCS key descriptor, USR with the first three (USR), four (UF4) and five (UF5) moments, and a combination of MACCS with USR (three moments). The MACCS key descriptor was not combined with UF5, due to similar performance of UF5 and UF4. Superior performance in terms of all figures of merit was found for the MACCS/UF4 Hybrid descriptor with respect to all other descriptors examined. These figures of merit include recall in the top 1% and top 5% of the ranked validation sets, precision, F-measure, area under the Receiver Operating Characteristic curve and Matthews Correlation Coefficient.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

USRCAT: real-time ultrafast shape recognition with pharmacophoric constraints

UNLABELLED BACKGROUND Ligand-based virtual screening using molecular shape is an important tool for researchers who wish to find novel chemical scaffolds in compound libraries. The Ultrafast Shape Recognition (USR) algorithm is capable of screening millions of compounds and is therefore suitable for usage in a web service. The algorithm however is agnostic of atom types and cannot discrimina...

متن کامل

بازیابی مبتنی بر شکل اجسام با توصیفگرهای بدست آمده از فرآیند رشد کانتوری

In this paper, a novel shape descriptor for shape-based object retrieval is proposed. A growing process is introduced in which a contour is reconstructed from the bounding circle of the shape. In this growing process, circle points move toward the shape in normal direction until they  get to the shape contour. Three different shape descriptors are extracted from this process: the first descript...

متن کامل

SHAFTS: A Hybrid Approach for 3D Molecular Similarity Calculation. 1. Method and Assessment of Virtual Screening

We developed a novel approach called SHAFTS (SHApe-FeaTure Similarity) for 3D molecular similarity calculation and ligand-based virtual screening. SHAFTS adopts a hybrid similarity metric combined with molecular shape and colored (labeled) chemistry groups annotated by pharmacophore features for 3D similarity calculation and ranking, which is designed to integrate the strength of pharmacophore ...

متن کامل

Application of 3D Zernike descriptors to shape-based ligand similarity searching

BACKGROUND The identification of promising drug leads from a large database of compounds is an important step in the preliminary stages of drug design. Although shape is known to play a key role in the molecular recognition process, its application to virtual screening poses significant hurdles both in terms of the encoding scheme and speed. RESULTS In this study, we have examined the efficac...

متن کامل

Spherical Harmonics Coefficients for Ligand-Based Virtual Screening of Cyclooxygenase Inhibitors

BACKGROUND Molecular descriptors are essential for many applications in computational chemistry, such as ligand-based similarity searching. Spherical harmonics have previously been suggested as comprehensive descriptors of molecular structure and properties. We investigate a spherical harmonics descriptor for shape-based virtual screening. METHODOLOGY/PRINCIPAL FINDINGS We introduce and valid...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • Chemistry Central Journal

دوره 2  شماره 

صفحات  -

تاریخ انتشار 2008